Searching for Regularities in Weighted Sequences

Authors

  • M. Christodoulakis
  • C. Iliopoulos
  • K. Tsichlas
Abstract

In this paper we describe algorithms for finding regularities in weighted sequences. A weighted sequence is a sequence of symbols drawn from an alphabet Σ, where each symbol has a prespecified probability of occurrence. We show that known algorithms for finding repeats in solid sequences may fail to do so for weighted sequences. In particular, we show that Crochemore’s algorithm for finding repetitions cannot be applied in the case of weighted sequences. However, one can use Karp’s algorithm to identify repeats of a specific length. We also extend this algorithm to identify the covers of a weighted sequence. Finally, the implementation of Karp’s algorithm brings up some very interesting issues.
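To make the notion of a weighted sequence concrete, the following is a minimal Python sketch, not the paper's method. It assumes a hypothetical representation (one dictionary per position mapping each symbol to its probability), assumes positions are independent so that a word's occurrence probability is the product of its symbol probabilities, and assumes a user-chosen probability `threshold` below which an occurrence is ignored. The function `repeats_of_length` is a brute-force enumeration for illustration only; it is not Karp's algorithm.

```python
from collections import defaultdict
from itertools import product


def occurrence_probability(seq, start, word):
    """Probability that `word` occurs at position `start` of `seq`.

    Assumption: positions are independent, so the probability is the
    product of the per-position symbol probabilities.
    """
    if start < 0 or start + len(word) > len(seq):
        return 0.0
    p = 1.0
    for offset, symbol in enumerate(word):
        p *= seq[start + offset].get(symbol, 0.0)
    return p


def repeats_of_length(seq, k, threshold):
    """Brute-force search for words of length k that occur at two or more
    start positions with probability >= threshold (hypothetical parameter).

    Enumerates every symbol combination in every window, so it is
    exponential in k in the worst case; illustration only.
    """
    positions = defaultdict(list)
    for start in range(len(seq) - k + 1):
        window = seq[start:start + k]
        # Each combo picks one (symbol, probability) pair per position.
        for combo in product(*(pos.items() for pos in window)):
            p = 1.0
            for _, prob in combo:
                p *= prob
            if p >= threshold:
                word = "".join(symbol for symbol, _ in combo)
                positions[word].append(start)
    return {w: ps for w, ps in positions.items() if len(ps) >= 2}


# Position 1 is uncertain: C or G, each with probability 0.5.
seq = [{"A": 1.0}, {"C": 0.5, "G": 0.5}, {"A": 1.0}, {"C": 1.0}]
print(occurrence_probability(seq, 0, "AG"))
print(repeats_of_length(seq, 2, 0.5))
```

In this toy input, the word `AC` occurs at position 0 with probability 0.5 and at position 2 with probability 1.0, so it is reported as a repeat at threshold 0.5, while `AG` appears only once and is not.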

Similar resources

Computation of Repetitions and Regularities of Biologically Weighted Sequences

Biological weighted sequences are used extensively in molecular biology as profiles for protein families, in the representation of binding sites and often for the representation of sequences produced by a shotgun sequencing strategy. In this paper, we address three fundamental problems in the area of biologically weighted sequences: (i) computation of repetitions, (ii) pattern matching, and (ii...

Varieties of Regularities in Weighted Sequences

A weighted sequence is a string in which a set of characters may appear at each position with respective probabilities of occurrence. A common task is to identify repetitive motifs in weighted sequences, with presence probability not less than a given threshold. We consider the problems of finding varieties of regularities in a weighted sequence. Based on the algorithms for computing all the re...

Computer system "Gene Discovery" for promoter structure analysis

This paper presents implementation of Data Mining and Knowledge Discovery techniques for searching for regularities in tables of context features of DNA sequences involved in regulation of transcription. The goal is to discover regularities that relate nucleotide sequences to the functional classes of these sequences. The search patterns for regularities have been constructed in the first-order...

I-45: Advance MRI Sequences in Pelvic Endometriosis

Background: To assess MRI in diagnosing endometriotic lesions, emphasizing T2*weighted imaging efficacy. Materials and Methods: This prospective study of 48 females (22-38 years, average 29.6) clinically suspected of endometriosis from September 2009 to April 2012. MRI was performed with a 1.5 T imager (Siemens) with a body array coil. T1, T2 and T2* weighted (2D-FLASH) sequences were obtained ...

Searching the genome of beluga (Huso huso) for sex markers based on targeted Bulked Segregant Analysis (BSA)

In sturgeon aquaculture, where the main purpose is caviar production, a reliable method is needed to separate fish according to gender. Currently, due to the lack of external sexual dimorphism, the fish are sexed by an invasive surgical examination of the gonads. Development of a non-invasive procedure for sexing fish based on genetic markers is of special interest. In the present study we empl...

Publication date: 2005